ABSTRACT The Middle East respiratory syndrome coronavirus (MERS-CoV) causes a severe acute respiratory tract infection
with a high fatality rate in humans. Coronaviruses are capable of infecting multiple species and can evolve rapidly through
recombination events. Here, we report the complete genomic sequence analysis of a MERS-CoV strain imported to China from
South Korea. The imported virus, provisionally named ChinaGD01, belongs to group 3 in clade B in the whole-genome phylogenetic
tree and also has a similar tree topology structure in the open reading frame 1a and -b (ORF1ab) gene segment but clusters
with group 5 of clade B in the tree constructed using the S gene. Genetic recombination analysis and lineage-specific
single-nucleotide polymorphism (SNP) comparison suggest that the imported virus is a recombinant comprising group 3 and group 5
elements. The time-resolved phylogenetic estimation indicates that the recombination event likely occurred in the second half of
2014. Genetic recombination events between group 3 and group 5 of clade B may have implications for the transmissibility of the
virus.

IMPORTANCE The recent outbreak of MERS-CoV in South Korea has attracted global media attention due to the speed of spread
and onward transmission. Here, we present the complete genome of the first imported MERS-CoV case in China and demonstrate
genetic recombination events between group 3 and group 5 of clade B that may have implications for the transmissibility
of MERS-CoV.


Received 28 July 2015 Accepted 30 July 2015 Published 8 September 2015 

Citation Wang Y, Liu D, Shi W, Lu R, Wang W, Zhao Y, Deng Y, Zhou W, Ren H, Wu J, Wang Y, Wu G, Gao GF, Tan W. 2015. Origin and possible genetic recombination of the
Middle East respiratory syndrome coronavirus from the first imported case in China: phylogenetics and coalescence analysis. mBio 6(5):e01280-15. doi:10.1128/mBio.01280-15.
Editor Michael J. Buchmeier, University of California, Irvine 

Copyright © 2015 Wang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial-ShareAlike 3.0 Unported 
license, which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original author and source are credited. 

Address correspondence to George F. Gao, gaofu@chinacdc.cn, or Wenjie Tan, tanwj28@163.com.

This article is a direct contribution from a Fellow of the American Academy of Microbiology. 


Middle East respiratory syndrome coronavirus (MERS-CoV),
first detected in the Kingdom of Saudi Arabia
(KSA) in 2012, causes severe acute respiratory tract infection in
humans, with a high case fatality rate (CFR) (1-4). Dromedary
camels are believed to be important reservoir hosts or vectors
for human infection; bats may also be implicated (5-8). As of
17 July 2015, 1,368 laboratory-confirmed cases of human infection
with MERS-CoV had been reported to the World
Health Organization (WHO), including at least 490 deaths,
corresponding to a CFR as high as 35.45% (9). Recent MERS
clusters in South Korea are thought to be the largest outbreak
outside the Middle East countries (10). As of 25 July 2015, 186
laboratory-confirmed cases of MERS-CoV infection (including
36 deaths) had been reported in South Korea (9). A South
Korean man who was a relative of some of the laboratory-confirmed
cases traveled to Guangdong Province (10) and was
diagnosed as the first imported MERS-CoV case in China by
molecular detection of MERS-CoV (11, 12). The rapid spread
of disease in South Korea raised concerns that the imported
virus had evolved to become more transmissible. Here, we report
a comprehensive phylogenetic analysis of the complete
MERS-CoV genome sequence of the first Chinese imported
case of MERS (ChinaGD01); the results indicate its probable
origin and show evidence of genetic recombination.

RESULTS 

Patient and sample history. The current outbreak in South Korea 
and China was initiated when a 68-year-old Korean man flew back 
to Seoul on 4 May 2015 after a visit to four Middle East countries 
(Bahrain, United Arab Emirates, Saudi Arabia, and Qatar). On 
26 May 2015, a 44-year-old South Korean man presented with 
fever to a hospital in Guangdong. He was in close contact with the 
FIG 1 Timeline of the travel history, potential virus exposure, onset of disease, and diagnosis of the first imported MERS-CoV case in China. UAE, United Arab
Emirates; KSA, Kingdom of Saudi Arabia.





FIG 2 Phylogenetic relationships based on complete genomes (A), ORF1ab genes (B), and S genes (C) of MERS-CoV strains. China’s first imported MERS-CoV strain
(GenBank accession no. KT006149.2), South Korea’s first MERS-CoV strain (GenBank accession no. KT029139), and the latest MERS-CoV strains prevalent in the
Middle East (GenBank accession no. KT026453 to KT026456) are indicated in red. The MERS-CoV strains derived from camels are indicated in blue. All of the complete
genomes were analyzed by nucleotide sequence alignment using the maximum-likelihood method implemented in RAxML. Numbers at the nodes indicate
bootstrap support for each node (percentage of 1,000 bootstrap replicates). Scale bars indicate the expected number of nucleotide substitutions per site.
index patient in South Korea on 16 May 2015 (Fig. 1), as well as a
suspected second-generation patient. The timeline of the travel
history, potential virus exposure, onset of disease, and diagnosis of
the first imported MERS-CoV case in China is presented in
Fig. 1.

Characterization of genome. With informed consent and the
approval of the ethical committee of the National Institute of Viral
Disease Control and Prevention, China Center for Disease Control
and Prevention (CDC), nasopharyngeal swabs were collected
and used for RNA extraction, followed by reverse transcription
PCR and genome sequencing. Through both Sanger and Ion
Torrent sequencing, the full-length virus genome (30,144 bp) of
ChinaGD01 was obtained and deposited in GenBank (accession
no. KT006149). Over 2,000,000 paired-end reads were quality
trimmed and processed to remove human genome sequences.
Nonhuman reads were assembled into contigs with CLC Genomics
Workbench and aligned against representative sequences of
MERS-CoV. No nucleotide insertions or deletions were observed
in the genome.

The genome sequence of this virus, referred to as ChinaGD01,
had high levels of nucleotide identity (99.33% to 99.79%) to previously
published MERS-CoV genomes (Fig. 2), with 99.31% to
99.78% sequence identity in the open reading frame 1a and -b
(ORF1ab) gene segment and 98.91% to 99.60% identity in the S
gene. The E, M, and N genes had 98.93% to 100% identity with
previously described MERS-CoV strains. In total, ChinaGD01
possessed 11 nonsynonymous nucleotide substitutions (Table 1),
which occurred in the ORF1ab (n = 8), ORF3 (n = 1), ORF4b
(n = 1), and M (n = 1) genes. Although
there were five nucleotide substitutions in the S gene, none caused
an amino acid change. Of note, in comparison with previously
published MERS-CoV genomes, the ChinaGD01 genome
shows 11 unique amino acid substitutions, 8 of which are
shared with the newly released South Korean strains and the latest
strains prevalent in Saudi Arabia (Table 1).
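As an illustration of how pairwise identity values such as those above can be derived from an alignment, the following minimal Python sketch computes percent nucleotide identity between two pre-aligned sequences. The function name `percent_identity` and the choice to skip columns containing a gap or an ambiguous base (N) are assumptions made for this example; the text does not state which tool or gap convention was used for the reported values.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption for this sketch: columns where either sequence has a gap
    ('-') or an ambiguous base ('N') are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # skip columns that cannot be compared
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences (not real MERS-CoV data): 1 mismatch in 10 columns.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

If the comparison spanned all 30,144 positions, identities of 99.33% and 99.79% would correspond to roughly 200 and 60 differing sites, respectively.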

Phylogenetic analysis. To further investigate the genetic relationship
between ChinaGD01 and other MERS-CoV strains
whose genomes are available, we performed phylogenetic analyses
using the complete genome, the ORF1ab gene, and the S
gene. From the whole-genome phylogeny, all available MERS-CoV
strains can be clustered into two clades, the earlier clade A
and the more recent clade B (Fig. 2A). ChinaGD01 fell into
group 3 of clade B (Fig. 2A). Within group 3, ChinaGD01 and
the South Korean and Saudi Arabian strains from 2015 were
closely clustered and formed a long branch, separate from others
of group 3. The nearest strain to this branch was Hafr-Al-Batin-1-2013
(GenBank accession no. KF600628), isolated in
August 2013. Phylogenetic analysis of the ORF1ab gene indicated
a similar topology in which ChinaGD01 and the recent MERS-CoV
strains identified in South Korea were closely adjacent to
Hafr-Al-Batin-1-2013 in group 3 (Fig. 2B). However, the phylogeny
of the S gene differed in that the new viruses fell into group 5
and were closely related to viruses from both humans and dromedaries
(Fig. 2C). These findings are consistent with recombination,
a phenomenon not uncommon in coronaviruses.

Genetic recombination analysis. To examine whether genetic
recombination has occurred in ChinaGD01, we performed
bootscanning analyses. We compared ChinaGD01 with
representative viruses from group 3 (Hafr-Al-Batin-1-2013;
GenBank accession no. KF600628), group 5 (KSA-CAMEL-378;


TABLE 1 Comparison of sites of variation between gene sequences of
ChinaGD01, the first South Korean strain, the latest Saudi Arabia
strains, and other MERS-CoV strainsᵃ
ᵃ The positions of amino acid substitutions are indicated by boldface. China, imported
MERS-CoV strain ChinaGD01 (GenBank accession no. KT006149); South Korea,
South Korea’s first MERS-CoV strain (GenBank accession no. KT029139); Saudi
Arabia, MERS-CoV strains recently identified in Saudi Arabia (GenBank accession no.
KT026453 to KT026456); Others, other MERS-CoV strains detected worldwide; Nt,
nucleotide; Aa, amino acid.


GenBank accession no. KJ713296), and group 1 (Abu Dhabi_UAE_9_2013;
GenBank accession no. KP209312) as controls. As
shown in Fig. 3A, ChinaGD01 was more similar to the group 3
strain from position 1 to 15,000 and more similar to the group 5
strain from approximately position 18,000 to 24,000. We then
compared the single-nucleotide polymorphisms (SNPs) of
ChinaGD01 with consensus sequences of group 3 and group 5
(Fig. 3B; see also Fig. S1 and S2 in the supplemental material).
There were 78 SNPs discovered along the ChinaGD01 genome
(Fig. 3B). Before position 17,206, the SNP pattern of ChinaGD01
is nearly identical to that of the group 3 viruses, whereas
between positions 17,311 and 23,804 it is more similar to that of
the group 5 viruses. The consistency in the results of
bootscanning and SNP analyses supports the hypothesis that the
gene segment from approximately position 17,300 to 24,000,
representing portions of the ORF1ab and S genes, reflects a
recombination event (Fig. 3B).
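The SNP comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used to produce Fig. 3B: it assumes three pre-aligned sequences of equal length and considers only the columns where the two consensus sequences differ from each other, labeling each such column by which consensus the query matches. The function name `classify_informative_sites` and the labels are hypothetical.

```python
def classify_informative_sites(query, cons_group3, cons_group5):
    """Label alignment columns where the two consensus sequences differ.

    Returns a list of (1-based position, label), where label is
    'group3', 'group5', or 'neither' depending on which consensus
    the query base matches at that column.
    """
    if not (len(query) == len(cons_group3) == len(cons_group5)):
        raise ValueError("all sequences must be aligned to the same length")
    calls = []
    columns = zip(query.upper(), cons_group3.upper(), cons_group5.upper())
    for pos, (q, g3, g5) in enumerate(columns, start=1):
        if g3 == g5:
            continue  # column does not distinguish the two groups
        if q == g3:
            calls.append((pos, "group3"))
        elif q == g5:
            calls.append((pos, "group5"))
        else:
            calls.append((pos, "neither"))
    return calls


if __name__ == "__main__":
    # Toy sequences (not real data): the query switches from a
    # group 3-like pattern to a group 5-like pattern after position 3.
    for pos, label in classify_informative_sites(
        "AAAAAGAG", "AAAAAAAA", "AAGAAGAG"
    ):
        print(pos, label)
```

A run of consecutive `group3` calls followed by a run of `group5` calls, as in the toy example, is the kind of pattern that suggests a breakpoint between the two runs.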

Phylogenetic analysis was further performed using BEAST 
with the complete genome, the nonrecombinant region (positions 
1 to 17,300), and the potential recombinant region (positions 
17,301 to 24,000), respectively (Fig. 4). The phylogenies revealed 
by the BEAST trees were consistent with those from the 
maximum-likelihood trees. In the trees constructed using the 
FIG 3 Recombination analyses of complete MERS-CoV genomes. (A) Bootscanning analysis of the MERS-CoV genome. The ChinaGD01 strain was used as a query
sequence and compared with one strain from group 3 (GenBank accession no. KF600628.1), one from group 5 (GenBank accession no. KJ713296.1), and one
from group 1 (GenBank accession no. KP209312.1). (B) Single-nucleotide differences between the ChinaGD01 sequence and consensus sequences of group 3 and
group 5. Group 3 cons, consensus sequences of group 3 strains; group 5 cons, consensus sequences of group 5 strains; South Korea, first MERS-CoV strain
(GenBank accession no. KT029139) in South Korea; KSA-2015, latest strains prevalent in Saudi Arabia (GenBank accession no. KT026453 to KT026456);
Bisha-1/2012, an earlier strain used as a control.


complete genome and the nonrecombinant region, ChinaGD01
fell within group 3; however, in trees constructed using the
recombinant region, it clustered with the group 5 sequences.

To date the recombination event, we estimated the time to
most recent common ancestor for the novel MERS-CoV from
2015. Although there was a slight difference among results
from different models, the time to most recent common ancestor
of the 2015 cluster was estimated to be between 0.5 and
0.7 years before the identification of the imported case, that is,
in the latter months of 2014 (Table 2). Given the observation of
similar recombination events in the newly released South Korean
strains and the latest strains prevalent in Saudi Arabia, the
travel histories of patients, and potential opportunities for virus
exposure, we surmise that the recombination likely occurred
in the Arabian Peninsula.
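The conversion from a time to most recent common ancestor, expressed in years before sampling, to a calendar date is simple arithmetic and can be checked with a short Python sketch. The reference date used here, 26 May 2015, is an assumption for illustration: it is the date on which the patient presented to the hospital in Guangdong, and the exact tip date used in the BEAST analysis is not stated in the text. The function name `tmrca_to_date` is likewise hypothetical.

```python
from datetime import date, timedelta


def tmrca_to_date(reference: date, years_before: float) -> date:
    """Convert 'years before the reference date' to a calendar date.

    Uses an average year length of 365.25 days, which is adequate for
    the sub-year intervals considered here.
    """
    return reference - timedelta(days=round(years_before * 365.25))


if __name__ == "__main__":
    reference = date(2015, 5, 26)  # assumed reference date (see lead-in)
    for years in (0.5, 0.7):       # bounds quoted in the text
        print(years, tmrca_to_date(reference, years).isoformat())
```

Under this assumption, both bounds fall between September and November 2014, consistent with a common ancestor in the second half of 2014.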

DISCUSSION 

Over the past 3 years, MERS-CoV infections have continued to
increase, posing a serious threat to global public health. Previous
studies have revealed that MERS-CoV infections are likely due to
repeated introductions of MERS-CoV from dromedary camels to
humans (13-15), resulting in only limited human-to-human
transmission (16). However, the large number of second- and
third-generation cases in South Korea raised concerns that MERS-CoV
may have evolved to become more adapted to human-to-human
transmission.

Our results indicate that at the whole-genome level,
ChinaGD01 is >99% similar to the previously identified MERS-CoV
strains. Phylogenetic analysis based on the whole-genome
sequence revealed that it belongs to group 3 of clade B MERS-CoV
strains and forms a separate small branch with viruses from South
Korea and Saudi Arabia from 2015. Different phylogenies were
observed in the trees constructed using the full-length genome
and the S gene, indicating the possibility of a recombination event.
Further evidence of a recombination event was obtained through
bootscanning and SNP analyses. BEAST analysis revealed that it
might have occurred recently, in the second half of 2014, in the
Middle East.
FIG 4 Time-resolved phylogenetic analyses of complete genomes (A), nonrecombination regions (B), and recombination regions (C) of MERS-CoV strains
using BEAST. The nonrecombination region is approximately bp 1 to 17,300, and the recombination region is approximately bp 17,301 to 24,000. ChinaGD01
(GenBank accession no. KT006149), South Korea’s first MERS-CoV strain (GenBank accession no. KT029139), and the latest strains prevalent in Saudi Arabia
(GenBank accession no. KT026453 to KT026456) are indicated in red.


Genetic recombination has been well established in severe
acute respiratory syndrome coronavirus (SARS-CoV) (17, 18);
however, there is only one report of genetic recombination in
MERS-CoV (19). Dudas and Rambaut point to frequent recombination
in MERS-CoV and partition the genome into two
parts in which nucleotides 1 to 23,722 and nucleotides 23,723
to 30,126 have independent molecular clock rates. Based on the
latest genome sequences from South Korea and the Kingdom of
Saudi Arabia, our research indicated that a novel type of genetic
recombination has occurred in the MERS-CoV strains prevalent
in South Korea. We note that six MERS-CoV isolates from
2015 (ChinaGD01, the first MERS-CoV strain from South Korea,
and the four latest strains from Saudi Arabia) had high
levels of nucleotide identity (99.90% to 99.96%) and showed
the same recombination signal in our analyses. We speculate
that they arose from a common recombination event. However,
more studies are needed to understand the relationship
between genetic recombination of MERS-CoV, the biological
properties it conveys, and its relevance to the recent high rate of
transmission.

MATERIALS AND METHODS 

Full-length genomic sequencing. Nasopharyngeal swabs from the South
Korean patient diagnosed with MERS-CoV infection were collected and
used for viral RNA extraction with the QIAamp viral RNA minikit. Forty-four
sets of specific primer pairs were designed and used to amplify the
complete genome, followed by Sanger sequencing; meanwhile, the extracted
viral RNA was also used for next-generation sequencing with the
Ion Torrent PGM after random amplification.

Phylogenetic analysis. We downloaded all (n = 92) available full-length
genome sequences of MERS-CoV from GenBank and used RAxML
(20) for phylogenetic analyses of the complete genome, the ORF1ab gene,
and the S gene. One thousand bootstrap replicates were run.
Furthermore, the Bayesian Markov chain Monte Carlo method, implemented
in BEAST (21), was used to estimate the time to the most recent
common ancestor. Twelve different model combinations were applied.
For all the analyses, we used the general time-reversible nucleotide substitution
model with gamma-distributed rate heterogeneity. Bayesian
Markov chain Monte Carlo analysis was run for 50 million steps. Trees
and parameters were sampled every 5,000 steps, with the first 10% removed
as burn-in.
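For orientation, the sampling settings above imply a fixed number of posterior samples per run. The following short Python sketch works through that arithmetic; the function name `mcmc_sample_counts` is hypothetical, and the count ignores the initial state that some samplers also log at step 0, so the actual number of logged samples may differ by one.

```python
def mcmc_sample_counts(chain_length: int, sample_every: int, burnin_percent: int):
    """Return (total, discarded, retained) sample counts for an MCMC run.

    Integer arithmetic is used throughout to avoid rounding surprises.
    """
    total = chain_length // sample_every
    discarded = total * burnin_percent // 100
    return total, discarded, total - discarded


if __name__ == "__main__":
    # Settings stated in the text: 50 million steps, sampled every
    # 5,000 steps, first 10% removed as burn-in.
    total, discarded, retained = mcmc_sample_counts(50_000_000, 5_000, 10)
    print(total, discarded, retained)
```

With the stated settings, this gives 10,000 samples per run, of which 1,000 are discarded and 9,000 are retained for estimating the time to the most recent common ancestor.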

Genetic recombination analysis. Similarity plots and bootscanning
analyses were generated with SimPlot (22); a sliding window of 200
nucleotides was used, moving in 20-nucleotide steps.
Single-nucleotide-difference analysis was used to confirm the recombination
event.
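The sliding-window scheme (200-nucleotide window, 20-nucleotide step) can be illustrated with a minimal Python sketch that computes a similarity profile between a query and one reference along an alignment. This is only the similarity-plot half of the analysis and is not SimPlot's implementation; bootscanning additionally builds bootstrapped trees within each window, which is not attempted here. The function names `window_starts` and `similarity_profile` are hypothetical, and the sketch assumes gap-free, pre-aligned sequences of equal length.

```python
def window_starts(length: int, window: int = 200, step: int = 20) -> range:
    """0-based start offsets of all full windows that fit in `length`."""
    return range(0, length - window + 1, step)


def similarity_profile(query: str, reference: str, window: int = 200, step: int = 20):
    """Percent identity of query vs. reference in each sliding window.

    Returns a list of (window midpoint, percent identity), with the
    midpoint given as a 0-based offset into the alignment.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in window_starts(len(query), window, step):
        end = start + window
        matches = sum(
            1 for a, b in zip(query[start:end], reference[start:end]) if a == b
        )
        profile.append((start + window // 2, 100.0 * matches / window))
    return profile


if __name__ == "__main__":
    # Toy example (not real data): identical first half, different second half.
    query = "A" * 400
    reference = "A" * 200 + "C" * 200
    for midpoint, identity in similarity_profile(query, reference, step=100):
        print(midpoint, identity)
```

Plotting such profiles for several references on the same axes shows where the query switches from resembling one reference to resembling another. For a 30,144-nucleotide alignment, these window settings yield 1,498 full windows.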

Nucleotide sequence accession number. The full-length virus genome
(30,144 bp) of ChinaGD01 was deposited in GenBank under accession
no. KT006149.


TABLE 2 Estimated times in years to most recent common ancestor of the 2015 cluster

Value for indicated coalescent model [mean (95% CI)]

Molecular clock model      Constant size            Exponential growth       Logistic growth          GMRF Bayesian skyrideᵃ
Strict clock               0.6594 (0.4259, 0.9137)  0.6046 (0.371, 0.8562)   0.6621 (0.4167, 0.9201)  0.6007 (0.3611, 0.8396)
Exponential relaxed clock  0.6008 (0.3586, 0.8944)  0.5208 (0.3339, 0.7868)  0.6037 (0.3578, 0.9059)  0.5199 (0.3141, 0.8026)
Lognormal relaxed clock    0.6394 (0.4049, 0.9247)  0.582 (0.3684, 0.8437)   0.641 (0.4085, 0.9167)   0.5711 (0.3351, 0.8264)

ᵃ GMRF, Gauss Markov random fields.
SUPPLEMENTAL MATERIAL

Supplemental material for this article may be found at
http://mbio.asm.org/lookup/suppl/doi:10.1128/mBio.01280-15/-/DCSupplemental.

Figure S1, PDF file, 0.5 MB.

Figure S2, PDF file, 0.4 MB.